Traditional electroencephalogram (EEG)-based brain-computer interfaces (BCIs) require a complete data collection, training, and calibration phase for each user before they can be used. In recent years, a number of subject-independent (SI) BCIs have been developed. However, many of these methods yield weaker performance compared to subject-dependent (SD) approaches, and some are computationally expensive. Potential real-world applications would greatly benefit from a more accurate, compact, and computationally efficient subject-independent BCI. In this work, we propose a novel subject-independent BCI framework named CCSPNet (Convolutional Common Spatial Pattern Network) that is trained on the motor imagery (MI) paradigm of a large-scale EEG signal database consisting of 400 trials for each of 54 subjects performing a two-class hand-movement MI task. The proposed framework applies a wavelet kernel convolutional neural network (WKCNN) and a temporal convolutional neural network (TCNN) to represent and extract the spectral features of EEG signals. A common spatial pattern (CSP) algorithm is implemented for spatial feature extraction, and the number of CSP features is reduced by a dense neural network. Finally, the class label is determined by a linear discriminant analysis (LDA) classifier. The CCSPNet evaluation results show that it is possible to build a compact BCI that achieves SD and SI state-of-the-art performance comparable to complex and computationally expensive models.
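The spatial stage of the pipeline above — CSP filtering followed by log-variance features, with the dense reduction and LDA classifier left out — can be sketched as follows. This is a minimal NumPy sketch of the textbook CSP algorithm, not the paper's implementation; array shapes and the number of filters are assumptions.

```python
import numpy as np

def csp_filters(X1, X2, n_components=4):
    # X1, X2: (trials, channels, samples) EEG epochs for the two MI classes
    C1 = np.mean([np.cov(t) for t in X1], axis=0)
    C2 = np.mean([np.cov(t) for t in X2], axis=0)
    # Whiten the composite covariance, then diagonalize class 1 in that space
    evals, evecs = np.linalg.eigh(C1 + C2)
    W = evecs @ np.diag(evals ** -0.5) @ evecs.T
    d, U = np.linalg.eigh(W @ C1 @ W.T)
    order = np.argsort(d)
    half = n_components // 2
    # Keep filters with the most extreme eigenvalues (most discriminative)
    pick = np.r_[order[:half], order[-(n_components - half):]]
    return U[:, pick].T @ W            # (n_components, channels) spatial filters

def csp_features(filters, X):
    # Log-variance of spatially filtered trials, normalized per trial
    Z = np.einsum('fc,tcs->tfs', filters, X)
    v = Z.var(axis=2)
    return np.log(v / v.sum(axis=1, keepdims=True))
```

The resulting feature matrix would then feed the dense reduction layer and the LDA classifier described in the abstract.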
Despite the fundamental distinction between adversarial and natural training (AT and NT), AT methods generally adopt momentum SGD (MSGD) for the outer optimization. This paper aims to analyze this choice by investigating the overlooked role of outer optimization in AT. Our exploratory evaluations reveal that AT induces higher gradient norm and variance compared to NT. This phenomenon hinders the outer optimization in AT, since the convergence rate of MSGD is highly dependent on the variance of the gradients. To this end, we propose an optimization method called ENGM, which regularizes the contribution of each input example to the average mini-batch gradient. We prove that the convergence rate of ENGM is independent of the variance of the gradients, and thus it is suitable for AT. We introduce a trick to reduce the computational cost of ENGM using empirical observations on the correlation between the norm of gradients w.r.t. the network parameters and w.r.t. the input examples. Our extensive evaluations and ablation studies on CIFAR-10, CIFAR-100, and TinyImageNet demonstrate that ENGM and its variants consistently improve the performance of a wide range of AT methods. Furthermore, ENGM alleviates major shortcomings of AT, including robust overfitting and high sensitivity to hyperparameter settings.
In this paper, we design a generative adversarial network (GAN)-based solution for super-resolution and segmentation of optical coherence tomography (OCT) scans of the retinal layers. OCT has been identified as a non-invasive and inexpensive imaging modality for discovering potential biomarkers for the diagnosis and progression of neurodegenerative diseases, such as Alzheimer's disease (AD). Current hypotheses presume that the thickness of the retinal layers, which is analyzable within OCT scans, can be an effective biomarker. As a logical first step, this work concentrates on the challenging task of retinal layer segmentation, and also on super-resolution for higher clarity and accuracy. We propose a GAN-based segmentation model and evaluate incorporating popular networks, namely U-Net and ResNet, into the GAN architecture, with additional blocks of transposed convolution and sub-pixel convolution for the task of upscaling OCT images from low to high resolution by a factor of four. We also incorporate the Dice loss as an additional reconstruction loss term to improve the performance of this joint optimization task. Our best model configuration empirically achieves a Dice coefficient of 0.867 and an mIOU of 0.765.
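The joint objective described above — a Dice loss for segmentation plus a reconstruction term for the super-resolved image — can be sketched in a few lines. This is an illustrative NumPy version: the L1 reconstruction term and the weighting `lam` are assumptions, not the paper's exact formulation.

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    # Soft Dice between a probability map and a binary mask
    inter = (pred * target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

def joint_loss(seg_pred, seg_target, sr_out, sr_target, lam=0.5):
    # Dice loss for segmentation plus an L1 reconstruction term for the
    # super-resolved image; `lam` is an assumed weighting, not the paper's value
    return (1.0 - dice_coefficient(seg_pred, seg_target)
            + lam * np.abs(sr_out - sr_target).mean())
```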
Optical coherence tomography (OCT) is a non-invasive and easily accessible modality for identifying a potential biomarker — the thickness of the retinal layers, detectable within OCT scans — to diagnose Alzheimer's disease (AD). This work aims at automatically segmenting OCT images. However, this is a challenging task due to various issues such as speckle noise, small target regions, and unfavorable imaging conditions. In our previous work, we proposed a multi-stage and multi-discriminatory generative adversarial network (MultiSDGAN) to translate OCT scans into high-resolution segmentation labels. In this investigation, we aim to evaluate and compare various combinations of channel and spatial attention with the MultiSDGAN architecture, in order to extract more powerful feature maps by capturing rich contextual relationships and thereby improve segmentation performance. Moreover, we develop and evaluate a guided multi-stage attention framework, in which we incorporate a guided attention mechanism by enforcing an L1 loss between a specifically designed binary mask and the generated attention maps. Our ablation study results on the WVU-OCT dataset with five-fold cross-validation (5-CV) suggest that the proposed MultiSDGAN with a serial attention module provides the most competitive performance, and that guiding the spatial attention maps with the binary masks further improves performance in our proposed networks. Comparing the baseline model with the addition of guided attention, our results demonstrate relative improvements of 21.44% and 19.45% in the Dice coefficient and SSIM, respectively.
We propose a quality-aware multimodal recognition framework that combines representations from multiple biometric traits with varying quality and number of samples, achieving increased recognition accuracy by extracting complementary identification information based on the quality of the samples. We develop a quality-aware framework for fusing the representations of the input modalities by weighting their importance using quality scores estimated in a weakly supervised fashion. This framework utilizes two fusion blocks, each represented by a set of quality-aware and aggregation networks. In addition to the architectural modifications, we propose two task-specific loss functions: a multimodal separability loss and a multimodal compactness loss. The first loss ensures that the representations of the modalities for a given class have comparable magnitudes, to provide better quality estimation, while the multimodal representations of different classes are spread apart to achieve maximum discrimination in the embedding space. The second loss, which regularizes the network weights, improves the generalization performance. We evaluate the performance on three multimodal datasets consisting of face, iris, and fingerprint modalities. The efficacy of the framework is demonstrated through comparison with state-of-the-art algorithms. In particular, our framework outperforms both feature-level and score-level fusion of the modalities of BIOMDATA by more than 30% in true acceptance rate at a false acceptance rate of $10^{-4}$.
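The core idea of weighting each modality's representation by an estimated quality score can be sketched as a softmax-weighted sum over per-modality embeddings. This is a minimal illustration of quality-aware fusion, not the paper's fusion block, which uses learned quality-aware and aggregation networks.

```python
import numpy as np

def quality_weighted_fusion(embeddings, qualities):
    # embeddings: list of per-modality feature vectors, all of dimension d
    # qualities: one scalar quality score per modality
    q = np.asarray(qualities, dtype=float)
    w = np.exp(q - q.max())
    w /= w.sum()                       # softmax-normalized fusion weights
    return (w[:, None] * np.stack(embeddings)).sum(axis=0)
```

With equal quality scores this reduces to a plain average; a dominant score makes the fused representation follow the highest-quality modality.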
This paper presents SuperMix, a supervised mixing augmentation method that exploits the salient regions within input images to construct mixed training samples. SuperMix is designed to obtain mixed images rich in visual features and complying with realistic image priors. To enhance the efficiency of the algorithm, we develop a variant of the Newton iterative method that is $65\times$ faster than gradient descent on this problem. We validate the effectiveness of SuperMix through extensive evaluations and ablation studies on two tasks: object classification and knowledge distillation. On the classification task, SuperMix provides performance comparable to advanced augmentation methods such as AutoAugment and RandAugment. In particular, combining SuperMix with RandAugment achieves 78.2% top-1 accuracy on ImageNet. On the distillation task, solely classifying images mixed using the teacher's knowledge achieves performance comparable to state-of-the-art distillation methods. Furthermore, on average, incorporating mixed images into the distillation objective improves the performance by 3.4% and 3.1% on CIFAR-100 and ImageNet, respectively. The code is available at https://github.com/alldbi/SuperMix.
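The idea of combining two images according to their salient regions can be illustrated with a hard per-pixel mask: each output pixel is taken from whichever image is more salient there, and the mask's mean serves as the soft-label mixing weight. This is a simplified sketch only — SuperMix itself optimizes the mixing mask (via its Newton-iteration variant) rather than thresholding precomputed saliency maps.

```python
import numpy as np

def saliency_mix(img_a, img_b, sal_a, sal_b):
    # img_a, img_b: (H, W, C) images; sal_a, sal_b: (H, W) saliency maps.
    # Hard-mask simplification of saliency-based mixing.
    mask = (sal_a >= sal_b).astype(img_a.dtype)
    mixed = mask[..., None] * img_a + (1.0 - mask)[..., None] * img_b
    lam = mask.mean()                  # soft-label weight for img_a's class
    return mixed, lam
```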
Existing automated techniques for software documentation typically attempt to reason between two main sources of information: code and natural language. However, this reasoning process is often complicated by the lexical gap between more abstract natural language and more structured programming languages. One potential bridge for this gap is the Graphical User Interface (GUI), as GUIs inherently encode salient information about underlying program functionality into rich, pixel-based data representations. This paper offers one of the first comprehensive empirical investigations into the connection between GUIs and functional, natural language descriptions of software. First, we collect, analyze, and open source a large dataset of functional GUI descriptions consisting of 45,998 descriptions for 10,204 screenshots from popular Android applications. The descriptions were obtained from human labelers and underwent several quality control mechanisms. To gain insight into the representational potential of GUIs, we investigate the ability of four Neural Image Captioning models to predict natural language descriptions of varying granularity when provided a screenshot as input. We evaluate these models quantitatively, using common machine translation metrics, and qualitatively through a large-scale user study. Finally, we offer learned lessons and a discussion of the potential shown by multimodal models to enhance future techniques for automated software documentation.
In this paper, we reduce the complexity of approximating the correlation clustering problem from $O(m\times\left( 2+ \alpha (G) \right)+n)$ to $O(m+n)$ for any given value of $\varepsilon$, for a complete signed graph with $n$ vertices and $m$ positive edges, where $\alpha(G)$ is the arboricity of the graph. Our approach gives the same output as the original algorithm and makes it possible to implement the algorithm in a fully dynamic setting where edge sign flipping and vertex addition/removal are allowed. Constructing this index costs $O(m)$ memory and $O(m\times\alpha(G))$ time. We also study the structural properties of the non-agreement measure used in the approximation algorithm. The theoretical results are accompanied by a full set of experiments on seven real-world graphs. These results show the superiority of our index-based algorithm over the non-indexed one, with an average decrease of 34% in running time.
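A common form of the non-agreement measure in this line of correlation-clustering approximation work declares two vertices in $\varepsilon$-agreement when the symmetric difference of their closed positive neighborhoods is small relative to the larger neighborhood. The sketch below uses that definition; whether it matches this paper's exact variant is an assumption.

```python
def in_agreement(adj, u, v, eps):
    # adj: dict mapping each vertex to the set of its positive neighbors.
    # u and v are in eps-agreement when the symmetric difference of their
    # closed positive neighborhoods is small relative to the larger one.
    Nu = adj[u] | {u}
    Nv = adj[v] | {v}
    return len(Nu ^ Nv) <= eps * max(len(Nu), len(Nv))
```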
This paper proposes a novel self-supervised based Cut-and-Paste GAN to perform foreground object segmentation and generate realistic composite images without manual annotations. We accomplish this goal by a simple yet effective self-supervised approach coupled with the U-Net based discriminator. The proposed method extends the ability of the standard discriminators to learn not only the global data representations via classification (real/fake) but also learn semantic and structural information through pseudo labels created using the self-supervised task. The proposed method empowers the generator to create meaningful masks by forcing it to learn informative per-pixel as well as global image feedback from the discriminator. Our experiments demonstrate that our proposed method significantly outperforms the state-of-the-art methods on the standard benchmark datasets.
Machine learning models are typically evaluated by computing similarity with reference annotations, and trained by maximizing that similarity. Especially in the bio-medical domain, annotations are subjective and suffer from low inter- and intra-rater reliability. Since annotations only reflect the annotating entity's interpretation of the real world, this can lead to sub-optimal predictions even though the model achieves high similarity scores. Here, the theoretical concept of Peak Ground Truth (PGT) is introduced. PGT marks the point beyond which an increase in similarity with the reference annotation stops translating to better Real World Model Performance (RWMP). Additionally, a quantitative technique to approximate PGT by computing inter- and intra-rater reliability is proposed. Finally, three categories of PGT-aware strategies to evaluate and improve model performance are reviewed.
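The proposed approximation of PGT from inter-rater reliability can be illustrated by taking the mean pairwise agreement between several raters' annotations as the ceiling beyond which extra similarity stops being meaningful. The sketch below uses binary masks and Dice as the similarity metric; that metric choice is an assumption for illustration, not the paper's prescription.

```python
import numpy as np
from itertools import combinations

def dice(a, b, eps=1e-7):
    # Dice overlap between two binary masks
    inter = np.logical_and(a, b).sum()
    return (2.0 * inter + eps) / (a.sum() + b.sum() + eps)

def approx_pgt(annotations):
    # Approximate Peak Ground Truth as the mean pairwise inter-rater
    # agreement over binary masks from several raters
    return float(np.mean([dice(a, b) for a, b in combinations(annotations, 2)]))
```

A model whose Dice against a single reference exceeds this value may simply be fitting that rater's idiosyncrasies rather than improving real-world performance.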